Diagnostic Performance of Two-Dimensional Ultrasound, Two-Dimensional Sonohysterography and Three-Dimensional Ultrasound in the Diagnosis of Septate Uterus—A Systematic Review and Meta-Analysis

Background: The septate uterus is the most common congenital uterine anomaly, and hysteroscopy is the gold standard for diagnosing it. The goal of this meta-analysis is to perform a pooled analysis of the diagnostic performance of two-dimensional transvaginal ultrasonography, two-dimensional transvaginal sonohysterography, three-dimensional transvaginal ultrasound, and three-dimensional transvaginal sonohysterography for the diagnosis of the septate uterus. Methods: Studies published between 1990 and 2022 were searched in PubMed, Scopus, and Web of Science. From 897 citations, we selected eighteen studies to include in this meta-analysis. Results: The mean prevalence of uterine septum in this meta-analysis was 27.8%. Pooled sensitivity and specificity were 83% and 99% for two-dimensional transvaginal ultrasonography (ten studies), 94% and 100% for two-dimensional transvaginal sonohysterography (eight studies), and 98% and 100% for three-dimensional transvaginal ultrasound (seven articles), respectively. The diagnostic accuracy of three-dimensional transvaginal sonohysterography was only described in two studies, and we did not calculate the pooled sensitivity and specificity for this method. Conclusion: Three-dimensional transvaginal ultrasound has the best performance capacity for the diagnosis of the septate uterus.


Introduction
Congenital uterine anomalies (CUAs) of the genital tract are the result of abnormal formation, canalization, or fusion of the paramesonephric ducts or the defective absorption of the midline septum during fetal life [1]. Prevalence in low-risk populations is difficult to assess, mostly because diagnostic methods are rarely applied in asymptomatic populations

Protocol and Registration
This meta-analysis was conducted following the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) recommendations and the SEDATE (Synthesizing Evidence from Diagnostic Accuracy Tests) guidelines [17,18]. Inclusion and exclusion criteria were defined before the literature search began. The study protocol was not registered in PROSPERO. Ethics committee approval was not required given the nature and design of this study.
For this meta-analysis, we considered as septate uteri those cases reported as septate (noted as complete or partial "subseptate" in many studies) in the primary studies included.

Data Sources and Searches
Five authors screened three electronic databases (PubMed/Medline, Scopus, and Web of Science). The search period ran from January 1990 to November 2022, and only articles in English were considered. The search terms used in all databases were as follows: "uterine anomalies," "müllerian anomalies," "transvaginal ultrasound," and "sonohysterography."

Study Selection and Data Collection
In collaboration, the five authors combined the searches in different databases and excluded duplicated articles. In the next step, we filtered the titles first and the abstracts second to identify irrelevant articles to exclude, such as those not strictly related to the topic under review or non-observational studies (i.e., reviews, case reports, and letters to the editor). Records were then filtered again with a complete reading of the full text of the studies that remained after exclusions.
This meta-analysis had the following inclusion criteria:
- Prospective or retrospective observational cohort studies including women diagnosed with uterine Müllerian anomalies using any of the following methods as index tests: 2D transvaginal ultrasound, 2D sonohysterography, 3D transvaginal ultrasound, or 3D sonohysterography.
- Reported data that allow for the construction of a 2 × 2 table to estimate true positive, true negative, false positive, and false negative cases for any of the index tests assessed.
- Hysteroscopy, with or without combined laparoscopy, as the reference standard.
The exclusion criteria were:
- Studies not related to the topic.
- Articles not reporting specific data regarding the septate uterus (complete or partial).
- Letters to the editor, commentaries, narrative reviews, consensus documents, and any other study that does not provide enough data to construct a 2 × 2 table.
- Studies in which hysteroscopy, with or without combined laparoscopy, was not used as the reference standard for the diagnosis of uterine Müllerian anomalies.
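To illustrate why a 2 × 2 table is required for inclusion, the following minimal sketch shows how sensitivity, specificity, and prevalence follow from the four cell counts. The class name `TwoByTwo` and the counts are hypothetical and are not taken from any included study.

```python
from dataclasses import dataclass


@dataclass
class TwoByTwo:
    """Cell counts of an index test against the reference standard."""
    tp: int  # index test positive, septum confirmed at hysteroscopy
    fp: int  # index test positive, no septum at hysteroscopy
    fn: int  # index test negative, septum confirmed at hysteroscopy
    tn: int  # index test negative, no septum at hysteroscopy

    def sensitivity(self) -> float:
        return self.tp / (self.tp + self.fn)

    def specificity(self) -> float:
        return self.tn / (self.tn + self.fp)

    def prevalence(self) -> float:
        total = self.tp + self.fp + self.fn + self.tn
        return (self.tp + self.fn) / total


# Hypothetical counts, for illustration only
example = TwoByTwo(tp=45, fp=2, fn=5, tn=148)
print(f"sensitivity = {example.sensitivity():.3f}")
print(f"specificity = {example.specificity():.3f}")
print(f"prevalence  = {example.prevalence():.3f}")
```

A study that reports only a summary accuracy figure, without these four counts, cannot contribute to pooled sensitivity and specificity estimates.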
If the recruitment dates of two cohort studies published by the same authors overlapped, we excluded the earlier one in order to avoid the inclusion of duplicate cohorts. We did not contact the authors. We used the snowball strategy to identify potentially relevant papers by reading the reference lists of the papers selected for full-text reading.
Five authors independently retrieved the following data from each study: first author, year of publication, country, study design, number of centers participating, patients' inclusion criteria, patients' exclusion criteria, patients' age, number of patients, number of patients with septate uterus, index test used, definition of septate used, number of examiners, whether the examiner was blinded or not to the reference standard, the reference standard used, the diagnostic accuracy results, and the time elapsed from the index test to the reference standard test. Disagreements arising during the process of study selection and data extraction were resolved by consensus among these five authors.

Risk of Bias in Individual Studies
The quality assessment of the studies included in the meta-analysis was conducted using the tool provided by the Quality Assessment of Diagnostic Accuracy Studies-2 (QUADAS-2) [19]. The QUADAS-2 format includes four domains: patient selection, index test, reference standard, and flow and timing. For each domain, the risk of bias and concerns about applicability (not applying to the domain of flow and timing) were analyzed and rated as low, high, or unclear risk. Five authors independently evaluated the methodological quality. Disagreements were solved by discussion between these authors.
The quality assessment was based on whether the article described the study's design and its inclusion and exclusion criteria, whether the operators were blinded, how the index test was performed and interpreted, which reference standard was used, and the time elapsed from the index test assessment to the reference standard result. Unclear risk was assigned when the corresponding information for a domain was not reported in the study. The arcuate uterus was considered a variant of normality and, therefore, was not considered a Müllerian anomaly. For the papers that studied the arcuate uterus separately, we included these cases in the non-septate group.
Surgery, including hysteroscopy and/or laparoscopy, was defined as the reference standard. The fact that surgeons were not blinded to ultrasound findings was not considered to pose a high risk of bias. For the flow-and-timing domain, the potential source of bias considered was the time elapsed from the index test assessment to the reference standard. We rated all studies that reported these data as low-risk, since the uterine septum is, by definition, present from birth and is assumed not to change over time.

Statistical Analysis
Heterogeneity for sensitivity and specificity was assessed using Cochran's Q statistic and the I² index. A p-value < 0.1 was considered to indicate heterogeneity. I² values of 25%, 50%, and 75% were considered to indicate low, moderate, and high heterogeneity, respectively [20]. Forest plots of the sensitivity and specificity of all studies were plotted. If heterogeneity existed, meta-regression was used to assess covariates that could explain it. The covariates analyzed for meta-regression were year of publication, sample size (n), uterine septum prevalence, classification type used, and type of study design.
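As a rough illustration of the heterogeneity measures named above, the sketch below computes Cochran's Q and the I² index from per-study effect estimates and their variances, using inverse-variance weights. This is a generic textbook formulation, not the STATA routine used in this study; the function name `cochran_q_i2` and the input values are hypothetical. The p-value for Q, which would be read from a chi-square distribution with k − 1 degrees of freedom, is not computed here.

```python
def cochran_q_i2(estimates, variances):
    """Return (Q, degrees of freedom, I2 in percent).

    estimates: per-study effect estimates (e.g. logit sensitivities)
    variances: their within-study variances
    """
    weights = [1.0 / v for v in variances]
    # Inverse-variance weighted pooled estimate
    pooled = sum(w * e for w, e in zip(weights, estimates)) / sum(weights)
    # Q: weighted sum of squared deviations from the pooled estimate
    q = sum(w * (e - pooled) ** 2 for w, e in zip(weights, estimates))
    df = len(estimates) - 1
    # I2: share of total variation attributable to between-study
    # heterogeneity, truncated at zero
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return q, df, i2


# Hypothetical logit-scale estimates and variances, for illustration only
q, df, i2 = cochran_q_i2([1.2, 2.1, 1.7, 3.0], [0.10, 0.15, 0.08, 0.20])
print(f"Q = {q:.2f} on {df} df, I2 = {i2:.1f}%")
```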
Summary receiver operating characteristic (sROC) curves were plotted to illustrate the relationship between sensitivity and specificity, and the area under the curve (AUC) was calculated. Publication bias was assessed using Deeks' method [21].
Statistical analysis was performed using STATA version 12.0 for Windows (Stata Corporation, College Station, TX, USA). A p-value < 0.05 was considered statistically significant.
The risk-of-bias evaluation and the concerns regarding the applicability of the selected studies are shown in Table 2.
Concerning the domain "index test," three studies were considered to have a high risk of bias because the definition applied to describe the septate uterus was not adequate. The definitions "myometrial echoes divided the fundal endometrial image in the transverse plane" [22] and "abnormal uterine cavity shape" [23] do not distinguish between a septate uterus and a didelphys/bicorporeal uterus, and "abnormal uterine contour" [27] is not specific to the uterine septum. Four other studies were of unclear risk of bias for the "index test" because the authors did not explain which diagnostic criteria for uterine septum were used [25,26,29,36]. All the studies were considered low-risk for the domain "reference standard," since all of them used hysteroscopy with or without laparoscopy as the reference standard to detect the uterine septum, as required by our inclusion criteria.
For the analysis of concerns about applicability, one study [23] was considered to have a high risk of bias for "patient selection" because women with abnormal uterine bleeding are not an adequate target population to investigate the uterine septum. All the other studies' domains were considered low-risk.
Meta-regression showed that for 2D TVS, uterine septum prevalence explained the heterogeneity observed in uterine septum diagnosis (p < 0.05), but the publication year and sample size did not. Regarding 2D SIS, the year of publication, sample size, and uterine septum prevalence did not explain the heterogeneity observed in diagnostic performance.
The ROC curves for the diagnostic performance of 2D TVS and 2D SIS to detect uterine septum are shown in Figures 4 and 5, respectively. The area under the curve for 2D TVS was 0.98 (95% CI 0.96-0.99). For 2D SIS, the area under the curve was 1.00 (95% CI 0.99-1.00). Fagan's nomogram showed that, from a pre-test probability of uterine septum of 23%, a positive 2D TVS result increased the post-test probability to 97% and a negative result decreased it to 5%, with an LR+ and LR− of 107 and 0.18, respectively (Figure 6). For 2D SIS, from a pre-test probability of 27%, a positive result increased the post-test probability to 99% and a negative result decreased it to 2%, with an LR+ and LR− of 455 and 0.06, respectively (Figure 7). We did not find publication bias for 2D TVS (p = 0.71) or 2D SIS (p = 0.06) (Figures 8 and 9, respectively).
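The post-test probabilities read from a Fagan plot follow from converting the pre-test probability to odds, multiplying by the likelihood ratio, and converting back. The sketch below applies this standard relationship to the pre-test probabilities and likelihood ratios reported above; the function name `post_test_probability` is ours and is used only for illustration.

```python
def post_test_probability(pre_test_prob: float, likelihood_ratio: float) -> float:
    """Bayes' rule in odds form: post-test odds = pre-test odds x LR."""
    pre_odds = pre_test_prob / (1.0 - pre_test_prob)
    post_odds = pre_odds * likelihood_ratio
    return post_odds / (1.0 + post_odds)


# 2D TVS: pre-test probability 23%, LR+ 107, LR- 0.18 (values reported above)
tvs_pos = post_test_probability(0.23, 107)
tvs_neg = post_test_probability(0.23, 0.18)

# 2D SIS: pre-test probability 27%, LR+ 455, LR- 0.06 (values reported above)
sis_pos = post_test_probability(0.27, 455)
sis_neg = post_test_probability(0.27, 0.06)

print(f"2D TVS: positive {tvs_pos:.1%}, negative {tvs_neg:.1%}")
print(f"2D SIS: positive {sis_pos:.1%}, negative {sis_neg:.1%}")
```

Rounded to whole percentages, these reproduce the reported values of 97% and 5% for 2D TVS and 99% and 2% for 2D SIS.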

Diagnostic Performance of 3D TVS and 3D SIS for Uterine Septum Detection
There are only two articles [13,33] describing the diagnostic accuracy of 3D SIS, and, for this reason, we cannot calculate the pooled sensitivity and specificity of this method.
The ROC curve for the diagnostic performance of 3D TVS to detect uterine septum is shown in Figure 11. The area under the curve was 0.99 (95% CI 0.94-1.00). Fagan's nomogram showed that, from a pre-test probability of uterine septum of 67%, a positive 3D TVS result increased the post-test probability to 100% and a negative result decreased it to 5% (Figure 12), with an LR+ and LR− of 504 and 0.02, respectively. We did not find publication bias for 3D TVS (p = 0.41), as shown in Figure 13.


Summary of Evidence
In the present study, we performed a meta-analysis of the diagnostic performance of 2D TVS, 2D TV SIS, and 3D TVS in the detection of the septate uterus, compared to the current gold standard: hysteroscopy and/or laparoscopy. We found 18 studies comprising 3737 patients with available data for analysis. The prevalence of the septate uterus was 27.8% (1039 cases) in a population of mostly infertile patients.

Limitations and Strengths
The main strength of our study is that this meta-analysis is the first to address this issue. We believe that the methodology used is correct.
As limitations of our meta-analysis, we consider that the number of studies was low, as was the quality of some papers; therefore, the results should be interpreted with caution. The characteristics of the participants included contribute to this limitation. The majority of the patients had a history of infertility, which we consider to be the most relevant group of women to study, although one article included any patient for whom a hysteroscopy was indicated (due to infertility, but also abnormal uterine bleeding or cervical polyps). Furthermore, a sub-analysis of these patients was not performed.
Additionally, seven studies recruited only patients with a suspected Müllerian anomaly based on a previous 2D TVS evaluation [13,25,28,31,34,35,38], and another two excluded patients older than 40 years [25,36], decisions that we consider a source of selection bias. Another limitation was the lack of uniformity in the ultrasound definitions used to classify CUAs, which could lead to unnecessary inclusions or exclusions in the studies.

Interpretation of Results
According to our data, 2D SIS showed a better sensitivity than 2D TVS (94% vs. 83%) and almost the same specificity (100% vs. 99%). On the other hand, 3D TVS has the best performance capacity for the diagnosis of the septate uterus (98% sensitivity and 100% specificity).
2D SIS has demonstrated its utility for the evaluation of the uterine cavity; it is accessible to most centers and has mild and uncommon side effects [39]. Therefore, in a center without 3D TVS, 2D SIS is a feasible and accessible option for the diagnosis of the septate uterus. Both techniques are significantly less expensive than others, such as MRI.
As stated above, the prevalence of the septate uterus in this meta-analysis was 27.8%. However, the studies included did not use the same definition criteria because, to date, there is no universally accepted classification system for CUAs. This is problematic because the prevalence of findings differs depending on the definition used. Over time, some scientific societies have made efforts to solve this issue. However, there is still no consensus on this topic, and the impact of the different definitions on diagnosis has been shown in several studies [40-43].
Congenital Uterine Malformation by Experts (CUME) published an article in 2018 with the goal of assessing the level of agreement between experts in distinguishing a septate uterus from a normal or arcuate uterus. It found that, over a series of patients, a uterus was defined as having a septum seven times more frequently using the ESHRE-ESGE criteria than using the ASRM criteria. According to this study, the ESHRE-ESGE criteria overdiagnose the septate uterus, while the ASRM criteria underdiagnose it [40]. CUME defines new cut-off values for indentation depth (≥10 mm), indentation angle (<140°), and I:WT ratio (>110%) and suggests using internal indentation depth to distinguish between normal/arcuate and septate uteri because of its simplicity and reliability [40]. Therefore, the decision to treat a septate uterus that might also be classified as normal is challenging for the physician. This problem also has repercussions for evaluating the clinical impact of different CUAs; thus, comparisons between management strategies are limited.
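As a simple illustration of how a single cut-off separates the two categories, the sketch below applies the CUME internal indentation depth threshold (≥10 mm) cited above. The function name and the example measurements are hypothetical. The sketch covers only the normal/arcuate versus septate distinction and does not address external fundal contour or other anomalies, so it is an illustration of the decision rule rather than a diagnostic tool.

```python
CUME_DEPTH_CUTOFF_MM = 10.0  # internal indentation depth cut-off cited above [40]


def classify_by_cume_depth(indentation_depth_mm: float) -> str:
    """Label a uterus as 'septate' or 'normal/arcuate' by indentation depth only."""
    if indentation_depth_mm >= CUME_DEPTH_CUTOFF_MM:
        return "septate"
    return "normal/arcuate"


# Hypothetical measurements in millimetres, for illustration only
for depth in (4.0, 9.9, 10.0, 18.5):
    print(f"{depth:5.1f} mm -> {classify_by_cume_depth(depth)}")
```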

Future Research Agenda
There is a need for better-quality studies that use uniform definition criteria in larger series of patients to establish definitively the performance of the different diagnostic methods, possibly including others such as MRI. We consider that a cost-efficiency analysis could be performed in order to inform recommendations in this area.

Conclusions
Based on these results, and because 2D TV SIS is an invasive method, we conclude that 3D US is the best method for the diagnosis of the septate uterus; however, as this equipment is not available in all centers, 2D SIS could be a reasonable option.
Institutional Review Board Statement: Institutional Review Board (IRB) approval was waived due to the study's design.